Endoscopic ultrasound-guided side-fenestrated needle biopsy sampling is sensitive for pancreatic neuroendocrine tumors but inadequate for tumor grading: a prospective study

Accurate pretreatment grading of pancreatic neuroendocrine tumors (PanNETs) is important to guide patient management. We aimed to evaluate endoscopic ultrasound-guided fine needle biopsy sampling (EUS-FNB) for the preoperative diagnosis and grading of PanNETs. In a tertiary-center setting, patients with suspected PanNETs were prospectively subjected to 22-gauge, reverse-bevel EUS-FNB. The EUS-FNB samples (Ki-67EUS) and corresponding surgical specimens (Ki-67SURG) were analyzed with Ki-67 indexing and thereafter tumor grading, (GRADEEUS) and (GRADESURG) respectively. In total 52 PanNET-patients [median age: 66 years; females: 25/52; surgical resection 22/52 (42%)] were included. EUS-FNB was diagnostic in 44/52 (85%). In 42 available FNB-slides, the median neoplastic cell count was 1034 (IQR: 504–3667) with 32/42 (76%), 22/42 (52%), and 14/42 (33%) cases exceeding 500, 1000, and 2000 neoplastic cells respectively. Ki-67SURG was significantly higher compared to Ki-67EUS with a moderate correlation comparing Ki-67EUS and Ki-67SURG (Pearson r = 0.60, r2 = 0.36, p = 0.011). The GRADEEUS had a weak level of agreement (κ = 0.08) compared with GRADESURG. Only 2/12 (17%) G2-tumors were correctly graded in EUS-FNB-samples. EUS-guided fine needle biopsy sampling is sensitive for preoperative diagnosis of PanNET but biopsy quality is relatively poor. Therefore, the approach seems suboptimal for pretreatment grading of PanNET.

The EUS-procedure. A linear echoendoscope (EG3870UTK, Pentax, Japan) and an ultrasound processor (HI VISION Ascendus, Hitachi, Tokyo, Japan) were used to examine the patients under deep sedation. The characteristics of target lesions were recorded. Before sampling, the echoendoscope was stabilized in the stomach or in the duodenum. Then, transmural puncture of the target lesion was performed by EUS-FNB using a 22-gauge reverse-bevel needle (EchoTip Procore®, Wilson-Cook Medical, Limerick, Ireland), applying fanning and standard suction 21 . All EUS-procedures of the study were performed by one of two dedicated and experienced endosonographers (> 1000 procedures).
The yield of EUS-FNB was put into formalin tubes and the FNB-core was assessed macroscopically. Additional FNB-passes were performed if the cores were considered inadequate at gross examination. No fixed number of passes was performed. Routine EUS-FNA (EchoTip®, Wilson-Cook Medical), and not EUS-FNB, was preferred during some periods when diagnostics was performed by subspecialized cytopathologists or if no FNB-needle was available on-site.
Histopathology. First, FNB-core biopsy samples were formalin-fixed and paraffin-embedded (FFPE) as per standard protocols. Sections (3-4 µm) were placed on positively charged glass slides and antigen retrieval performed using the Dako PT-Link system, using EnVision™ FLEX Target Retrieval Solution (TRS High). Samples were routinely stained with hematoxylin-eosin and immunohistochemistry was performed using the Dako Autostainer Link using EnVision™ FLEX according to the manufacturer's instructions (DakoCytomation).
The FNB-samples were regarded as diagnostic for PanNET only if cytomorphology and immunohistochemistry [positive staining for chromogranin A (CGA) and synaptophysin (SYN)] were consistent with the diagnosis. Otherwise, samples were regarded as non-diagnostic for PanNET.
Areas of interest (AOI) for digital quantification of the EUS-FNB samples were defined manually to exclude regions with appreciable presence of lymphocytes. Only groups of more than five neoplastic cells were selected, to avoid counting other cell types such as lymphocytes, granulocytes, fibroblasts or ordinary pancreatic glandular cells that could not be unambiguously differentiated from dissociated neoplastic cells. If a sufficient number of neoplastic cells was not present in the first AOI, a secondary hotspot was located and an additional AOI was manually defined to maximize the number of neoplastic cells counted (preferably 2000 cells). AOI in the resection specimens were defined as a circular region with a diameter of 500 µm drawn around the region with the highest fraction of Ki67 positivity (proliferative hotspot), determined at 10× magnification in the app.

Quantification of neoplastic cells and Ki-67 Index
Digital quantification of the total number of neoplastic cells in the AOIs and the fraction of Ki67 positive cell nuclei was performed using the #10143 Ki-67 Neuroendocrine Neoplasm app for Oncotopix from Visiopharm, using original app settings as provided by the manufacturer. For EUS-FNB samples, the total number of neoplastic cells was counted on Ki-67 slides and documented for the three largest groups of neoplastic cells (Cell count EUS ). For comparison, the total number of neoplastic cells was also counted in the ten largest groups of neoplastic cells.
Finally, the Ki-67 Index (%) of EUS-FNB samples (Ki-67 EUS ) was determined by dividing the number of positive cells by the total number of cells counted in the three largest groups of neoplastic cells. Based on the Ki-67 EUS , the tumor grade (GRADE EUS ) was estimated. For resection specimens, areas of non-neoplastic tissue (e.g. connective tissue and erythrocytes) were digitally removed by the tumor detection-algorithm of the app. DAB-positivity was scored as above and a minimum of 2000 neoplastic cells were counted in every specimen.
The Ki-67 Index (%) of resection specimens (Ki-67 SURG ) was determined by dividing the number of positive cells by the total number of cells counted. Based on the Ki-67 SURG , the tumor grade (GRADE SURG ) was determined.
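The ratio described above can be sketched in a few lines of Python. This is an illustration only and not the algorithm of the Visiopharm app; the function names, the pooling over the largest groups, and the example counts are hypothetical.

```python
from typing import Iterable, Tuple


def ki67_index(positive_cells: int, total_cells: int) -> float:
    """Ki-67 Index in percent: Ki-67 positive nuclei / all counted neoplastic cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= positive_cells <= total_cells:
        raise ValueError("positive_cells must lie between 0 and total_cells")
    return 100.0 * positive_cells / total_cells


def pooled_ki67_index(groups: Iterable[Tuple[int, int]], n_groups: int = 3) -> float:
    """Pool (positive, total) counts over the n_groups largest cell groups."""
    # Rank groups by their total neoplastic cell count, largest first.
    largest = sorted(groups, key=lambda g: g[1], reverse=True)[:n_groups]
    positive = sum(p for p, _ in largest)
    total = sum(t for _, t in largest)
    return ki67_index(positive, total)


# Hypothetical (positive, total) counts for four cell groups in one FNB sample.
example_groups = [(10, 400), (5, 300), (2, 200), (1, 100)]
print(round(pooled_ki67_index(example_groups), 2))                # three largest groups
print(round(pooled_ki67_index(example_groups, n_groups=10), 2))   # all available groups
```

Setting `n_groups=10` mirrors the comparison calculation in the ten largest groups; with fewer than ten groups available, all groups are simply pooled.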
For both EUS-FNB samples and resection specimens, the Ki-67 Index (%) determined as per standard diagnostic practice (manual cell counting in the microscope and on printed screenshots) was available for comparison. The pathologist performing the quantification of the resection specimens was blinded to the quantification of EUS-FNB samples.
The WHO classification of 2017 4 was applied for tumor grading. In study cases handled before 2017, the WHO classification of 2010 22,23 was applied but with a similar and pragmatic management as the 2017 version regarding the distinction between G1 and G2 of tumors with a Ki-67 Index above 2 but less than 3, Table 2.
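As a minimal sketch, the Ki-67 cut-offs of the WHO 2017 classification (G1 < 3%, G2 3-20%, G3 > 20%) can be expressed as a simple mapping. The sketch assumes grading by Ki-67 Index alone and ignores mitotic count and tumor differentiation, which the WHO classification also considers; the function name is hypothetical, and the grading actually applied in the study is the one given in Table 2.

```python
def who2017_grade_from_ki67(ki67_percent: float) -> str:
    """Map a Ki-67 Index (%) to a WHO 2017 grade using the Ki-67 cut-offs only."""
    if not 0.0 <= ki67_percent <= 100.0:
        raise ValueError("ki67_percent must be between 0 and 100")
    if ki67_percent < 3.0:
        # Includes indices above 2 but below 3, handled as G1 as in the 2017 version.
        return "G1"
    if ki67_percent <= 20.0:
        return "G2"
    return "G3"


# Hypothetical Ki-67 indices for illustration.
for value in (1.2, 2.6, 3.0, 20.0, 35.0):
    print(value, who2017_grade_from_ki67(value))
```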
Clinical follow up including patient management and surgery. All study patients were monitored via clinical follow-up for a minimum of six months and, if needed, for an extended period at least until the final diagnosis was established. The management of the study subjects was determined at the local multi-disciplinary therapy conference based on international guidelines 4,6,24 . Final diagnosis was based on the surgical specimen. In patients not subjected to surgery, the combination of clinical follow-up including biochemistry, radiology, somatostatin receptor imaging, and any other sampling modality for pathology was used as the reference standard. Cases with an unclear final diagnosis were not regarded as PanNET.
Study outcomes. The outcomes of this study were:
- The sensitivity of EUS-FNB for PanNET.
- The EUS-FNB-biopsy quality, i.e. the Cell Count EUS .
The cut-off levels for poor, fair, good, and excellent biopsy quality were set at < 500, 500-1000, 1000-2000, and > 2000 neoplastic cells respectively. All non-diagnostic EUS-FNB samples were by definition classified as < 500 neoplastic cells.
Statistical analysis. Descriptive, continuous data were described as median and interquartile range (IQR), while descriptive, categorical data were described as frequencies.
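The biopsy-quality categories and the median/IQR summary can be sketched as follows. The handling of the boundary values (exactly 1000 and exactly 2000 cells) is an assumption, since the stated ranges overlap at their ends; the function names and example counts are hypothetical. Note also that `statistics.quantiles` uses its own default quartile method, which may differ slightly from the one used by SPSS.

```python
from statistics import median, quantiles
from typing import Sequence, Tuple


def biopsy_quality(cell_count: int) -> str:
    """Classify biopsy quality from the neoplastic cell count (boundaries assumed)."""
    if cell_count < 0:
        raise ValueError("cell_count must be non-negative")
    if cell_count < 500:
        return "poor"
    if cell_count < 1000:
        return "fair"
    if cell_count <= 2000:  # assumption: exactly 2000 is "good"; "excellent" needs > 2000
        return "good"
    return "excellent"


def median_iqr(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of the values."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3


# Hypothetical neoplastic cell counts from seven FNB samples.
counts = [120, 480, 650, 900, 1800, 2500, 4000]
print([biopsy_quality(c) for c in counts])
print(median_iqr(counts))
```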
In the calculation of the diagnostic sensitivity of EUS-FNB, an intention-to-diagnose analysis was performed.
Fisher's exact test was used in the proportional analysis of any factors with a potential impact on the sampling yield and biopsy quality of EUS-FNB.
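For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table can be sketched with the standard library alone. This is an illustrative implementation and not the SPSS routine used in the study; the function name and the example table are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denominator = comb(n, col1)

    def probability(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = probability(a)
    # Sum all tables that are at most as probable as the observed one
    # (a small relative tolerance guards against floating-point ties).
    total = sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical table: adequate/inadequate yield in large versus small tumors.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```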
Pearson's test was used to calculate the correlation coefficient (r-value) between the Ki-67 Index as in EUS-FNB samples and as in surgical specimens. Additionally, Wilcoxon signed rank test was used to identify any difference in the Ki-67 Index comparing EUS-FNB samples and their corresponding surgical specimens.
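The two paired comparisons can be illustrated with a short sketch: Pearson's r (with r² as its square) and the rank sums underlying the Wilcoxon signed rank test. The sketch computes the test statistic only, not the p-value, and is not the SPSS implementation; function names and the paired values are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def wilcoxon_rank_sums(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W+, W-): rank sums of positive and negative differences y - x."""
    diffs = [b - a for a, b in zip(x, y) if b != a]  # zero differences are dropped
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired Ki-67 indices (%): EUS-FNB sample versus surgical specimen.
ki67_eus = [1.0, 2.0, 3.0, 4.0]
ki67_surg = [2.0, 2.5, 5.0, 3.5]
r = pearson_r(ki67_eus, ki67_surg)
print(round(r, 3), round(r ** 2, 3))
print(wilcoxon_rank_sums(ki67_eus, ki67_surg))
```

An r of 0.60 corresponds to r² = 0.36, i.e. about a third of the variance in one measurement is accounted for by the other.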
Cohen's kappa value was calculated to describe the level of agreement comparing PanNET grading in EUS-FNB samples and PanNET grading in the corresponding surgical specimens. Both an intention-to-diagnose analysis and a per-protocol analysis were performed.
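Cohen's kappa corrects the observed agreement for the agreement expected by chance, κ = (p_o − p_e) / (1 − p_e). A minimal, unweighted sketch is given below; the function name and the example gradings are hypothetical and do not reproduce the study data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two paired sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be paired and non-empty")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the marginal proportions, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical gradings for ten resected tumors.
grade_surg = ["G1"] * 4 + ["G2"] * 6
grade_eus = ["G1"] * 8 + ["G2"] * 2
print(round(cohens_kappa(grade_surg, grade_eus), 3))
```

In this made-up example, four of the six G2-tumors are under-graded, so kappa is low even though 60% of the labels agree.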
A p-value of < 0.05 was considered statistically significant in all analyses. The statistical calculations and tests were performed using IBM SPSS Statistics version 25.0.
No tested factor had a significant impact on the biopsy yield and quality in EUS-FNB samples, Table 3.

Accuracy of Ki-67-indexing and grading of PanNET in EUS-FNB samples.
Quantification and assessment of the Ki-67 Index in EUS-FNB samples (Ki-67 EUS ) was according to Table 2.
In the 17 cases that were subjected to surgery (#34 not included due to staining artifacts), there was only a moderate correlation between the Ki-67 EUS and the Ki-67 SURG (Pearson r = 0.60, r 2 = 0.36, p = 0.011). The Ki-67 SURG was significantly higher compared with the Ki-67 EUS , Fig. 2.
Based on the Ki-67 EUS , pretreatment tumor grading of the study PanNETs (GRADE EUS ) was according to Table 2. The GRADE EUS was found to have a weak level of agreement as compared with the tumor grade in the corresponding resection specimens (GRADE SURG ), Table 4. In the intention-to-diagnose analysis, only 2/12 (17%) tumors graded as G2 in surgical specimens (GRADE SURG ) were correctly graded as G2 also in EUS-FNB samples (GRADE EUS ). Analyzing only the cases with a Cell Count EUS > 1000 cells (n = 7: G1 n = 4; G2 n = 3), still 2/3 (67%) G2-tumors (GRADE SURG ) were graded as G1-tumors in FNB biopsy samples (GRADE EUS ).
For comparison, the quantification of neoplastic cells and assessment of the Ki-67 Index in the ten largest groups of neoplastic cells in the EUS-FNB samples did not result in a significant change in outcome, Supplementary Table 1 in Supplementary Materials.

Discussion
In this prospective study we have shown that EUS-guided fine needle biopsy sampling, using a 22-gauge reverse bevel needle, is sensitive for the diagnosis of pancreatic neuroendocrine tumors and allows pretreatment assessment of the tumor proliferation rate. However, the FNB-yield was sparse, which in many cases led to a falsely low estimation of the Ki-67 Index as compared with the Ki-67 Index in resection specimens. Thereby, the investigated approach resulted in a high rate of tumor under-grading with an apparent risk of incorrectly classifying true PanNET G2-tumors as G1-tumors at the preoperative stage.
Certainly, in non-functioning PanNETs, tumors of larger size have an increased likelihood of being of higher tumor grade as compared with smaller tumors 25 . Moreover, both MRI and EUS can determine tumor size with high precision 26 . Nevertheless, tumor size alone cannot act as a reliable surrogate marker of the tumor grade in PanNETs 27 . Therefore, an early and reliable estimation of the Ki-67 index in pretreatment PanNET tumor tissue would be highly valuable to estimate the tumor grade and thereby facilitate the decision on further clinical management and surgery.
According to the presented results, the 22-gauge reverse bevel FNB-needle seems to be appropriate in the EUS-based diagnosis of suspected PanNETs as such. The recorded sensitivity for PanNET of around 85% is comparable to findings in other studies investigating either the sensitivity of EUS-FNA 16,28 or that of EUS-FNB 29,30 . In a very recent publication by Crino et al. including a high number of patients subjected to EUS-FNB (n = 231), the true diagnostic sensitivity of EUS-FNB cannot be estimated given the retrospective design of the study and the method applied for selection of cases to be included in the study 31 .
Relatively few studies have focused on factors with a potential impact on the sensitivity and the sampling yield of EUS-FNA/FNB in PanNET. We noticed a tendency, admittedly without a significant p-value, that EUS-FNB had a higher sensitivity in large (> 20 mm) tumors as compared with small (< 20 mm) tumors. Apart from that finding, we detected no factors associated with sufficient yield. In one study by Hijioka and colleagues, analyzing the yield of EUS-FNA, tumor location in the pancreatic head and heavy tumor fibrosis were found to be negative factors, while no other tested factors were associated with poor yield 32 . In our study, we recorded a numerically, but not significantly, higher rate of FNB-samples with a neoplastic cell count > 1000 cells in tumors located in the pancreatic head as compared with tumors in the body-tail.
In the presented study cohort, the FNB-biopsy quality, i.e. the number of neoplastic cells acquired from the target tumor, was often relatively poor. A vast majority of samples had a neoplastic cell count < 2000 cells and a considerable number of samples had a cell count < 500 cells. It could be hypothesized that a high number of needle passes would increase the likelihood of a good FNB core with a high cell count. Although not statistically proven in our data set, we noticed a trend that a higher number of FNB-needle passes led to an increased rate of samples containing a minimum of 1000 neoplastic cells, Table 3. Accordingly, we suggest that endosonographers aim for at least 3 needle passes if using the reverse bevel needle in suspected PanNET. In the current study, gross examination of the FNB-core was performed to ensure adequate yield but also to avoid excessive needle passes. However, according to a recent report, on-site evaluation of samples might be of no significant benefit 33 .
The side-fenestrated FNB-needle has been shown to be significantly more often associated with a low cell count (< 500 cells) as compared with the end-cutting FNB-needles 31 . There has been a growing body of scientific support in favor of end-cutting instead of side-fenestrated FNB-needles [34][35][36][37][38] . Based on the results presented by us and by others in the aforementioned studies, we suggest that the side-fenestrated design of the FNB-needle used in the current study is an important negative factor in the explanation of the imperfect FNB-yield. When the current study was designed, there was a lack of data and evidence on this topic. Since the endosonographers engaged in the current study had long experience of EUS, the competence of the endosonographer was much less likely to be a factor of importance.
Apparently, the Ki-67 Index in FNB-samples acquired with a 22-gauge side-fenestrated needle is unreliable. Multiple factors may account for this finding. First, in any sampling modality including EUS-FNB, there is an evident risk of sampling error irrespective of the technique or needle used, since tumor heterogeneity is significant and tumor hot spots with a high proliferation rate can be missed at sampling 39 . Grillo and co-workers estimated that a large 18-gauge needle and the procurement of a 15 mm biopsy core might be required to obtain a relatively reliable sample for grading of G2-tumors 40 . The acquisition of such a large sample by the use of EUS is quite demanding irrespective of the needle used. Second, the tumor microarchitecture of PanNETs could be unfavorable with respect to sampling via EUS-FNB.
Third, and as hypothesized above, the construction and design of the reverse-bevel FNB-needle investigated in this study is likely less well adapted for sampling of PanNETs as compared with end-cutting EUS-FNB-needles 38 , which, most probably, should be used firsthand. It is less likely that the endosonographers, who were experienced, explain the poor FNB-yield.
Others have made significant efforts to assess the validity of pretreatment tumor grading of PanNETs by the use of samples acquired by EUS-guided sampling 41 . In a recent, retrospective publication, Leeds  Similarly, in a retrospective study analyzing 33 cases with PanNET, Hwang et al. recorded that the Ki-67 Index in EUS-FNB samples was significantly lower than in the corresponding surgical specimens. The reverse bevel FNB-needle was used in the study but needles of various sizes (19/22/25 gauge) were used. In eight cases information on the needle size was lacking. The authors concluded that there was a substantial risk of under-grading in EUS-FNB samples of Grade 2 and Grade 3 PanNETs 43 . The very same conclusion was drawn in a small study including 10 patients and published in 2016 44 .
As an exception, in a retrospective study including 59 patients, Di Leo and co-workers reported a high tumor grading agreement (84%) comparing the Ki-67 Index in twenty-five EUS-FNB samples obtained with various types of needles with that of available surgical specimens 45 . However, all EUS-FNB samples with a non-diagnostic yield were excluded in the above calculations, making the true grading agreement most probably far lower than reported, at least if an intention-to-diagnose approach had been applied. Moreover, in a majority of cases (80%) a 25-gauge EUS-FNA needle was actually used while a true 19/22-gauge EUS-FNB needle was used in only 20% of cases. Hence, in reality the above study could rather be considered a study analyzing the Ki-67 Index in a mixed set of histopathology and cytopathology samples. Similarly, in the study by Kamata and colleagues, a grading concordance of 83% was reported comparing 25-gauge reverse bevel EUS-FNB with surgical specimens 46 . However, the concordance would drop significantly if an intention-to-diagnose analysis had been performed instead of a per-protocol analysis. Paiella et al. suggested a relatively low risk of under-grading by EUS-sampling in 110 cases of PanNET, but again cases being non-diagnostic at EUS were excluded from the main analysis 47 .
In the study by Crino and co-workers mentioned above, the authors recorded a robust correlation between the Ki-67 Index in EUS-FNB samples and the Ki-67 Index in surgical specimens with only 3/77 (4%) cases being under-graded at EUS-FNB. However, in another 4 cases the Ki-67 Index could not be evaluated in the EUS-FNB samples, which thereby should be regarded as failures and incorrect grading. Interestingly, the authors also reported a near-significant trend (p = 0.07) that the use of end-cutting FNB-needles (Fork-tip: n = 129; Franseen-tip: n = 24) was superior to the reverse-bevel FNB-needle (n = 78) in the acquisition of samples adequate for Ki-67-indexing. In contrast to the study by Crino and colleagues, we applied a prospective study design investigating one needle type only without any variation in needle size. Such a study design minimizes the number of confounding factors. Moreover, results become easier to interpret, which in the current study amounts to clear support for not using the reverse bevel needle firsthand.
As in the above-mentioned studies, we recorded a high degree of under-grading in EUS-FNB samples with several G2-tumors being assessed as G1-tumors. From a preoperative management point of view, it might not be severely problematic if G2-tumors are falsely graded as G1-tumors since both groups are often candidates for surgery anyhow. Patients with a G1-tumor have a better prognosis, i.e. longer overall and disease-free survival, compared with patients with a G2-tumor 48 . Nevertheless, the quite sparse yield of reverse bevel EUS-FNB is indeed worrisome if it is to be used as a tool for selecting patients for surgery among elderly patients or other patients in whom the estimated risk-benefit ratio of surgery for low-grade tumors is high. The drawback of most studies published on the topic, including the current one, is that very few study subjects harbor G3-tumors. Most probably, the reason for the lack of data in this group of patients is that only few patients with G3-tumors are referred for EUS, since in many cases the preoperative diagnosis can be determined by other diagnostic modalities.
Others have evaluated digital quantification of Ki-67 positive cells in various neoplasms, for example in pulmonary NETs 11 and in PanNETs 10 . However, in a majority of publications, including both of those mentioned, quantification has been performed in resection specimens and not in pretreatment tumor tissue. In the study by Di Leo et al., manual quantification of the Ki-67 positive nuclei in FNB samples was performed 45 , as was the case in the study by Leeds and co-workers 42 . In the retrospective study by Hwang and colleagues, a digital image analyzer was indeed applied for cell counting, but the software used was different from the one used in the current work.
It is debatable how many groups of cohesive cells acquired by EUS-FNB should be included in the calculation of the Ki-67 Index and the tumor grade of EUS-FNB. In the current study we decided to include the three largest groups of neoplastic cells. To rule out this cut-off level as a source of bias, we performed the very same calculations of the Ki-67 Index also in the ten largest groups of cohesive cells, without any significant difference in outcome, Supplementary Material. Apparently, the inclusion of as many as ten groups of neoplastic cells did not reduce the rate of under-grading.
What about using EUS-FNA samples and smears for the preoperative assessment of the Ki-67 Index? In a meta-analysis published in 2016 including thirteen studies (263 cases), the pooled sensitivity of the Ki-67 Index estimated in EUS-FNA smears was 64% in discriminating G1 from G2/G3-tumors 49 . In a selected group of patients (n = 15), the assessment of the Ki-67 Index in cell-block preparations of EUS-FNA aspirates was reported to be a promising alternative to EUS-FNB, reaching an impressive 100% agreement with the Ki-67 Index of surgical specimens 50 . However, all cases with an EUS-FNA cell count of < 400 cells were excluded from the analysis. Most probably, the inclusion of these excluded cases in an intention-to-diagnose analysis would have lowered the agreement rate. The aim of the current study was not to evaluate EUS-FNA-samples for the estimation of the Ki-67 Index. Nevertheless, in the patients subjected to EUS-FNA at our center during 2015-2019, the cytopathologist reported an acceptable amount of tumor cells for assessment of the Ki-67 Index in only somewhat more than half of the patients.
To the best of our knowledge, the current work is the first large prospective study analyzing the accuracy of Ki-67-indexing in EUS-FNB specimens for pretreatment grading of PanNETs. The study is also strengthened by the fact that no case was lost from follow-up. The use of the identical WHO classification system throughout the study is another advantage, which guarantees reliability of the results and facilitates the interpretation of the results. Finally, the study pathologist was blinded to the Ki-67-calculations both in surgical specimens and in EUS-FNB samples.
There are some limitations of the study. This was a single-center study, which accounts for consistency in the method used. On the other hand, the number of resected cases was relatively small and the results presented need to be validated in external centers. Many PanNET-patients, especially older patients with small tumors, were not subjected to surgery but instead to surveillance. Therefore, the Ki-67 Index of all FNB-samples could not be compared with that of corresponding surgical specimens. A potential weakness is the fact that not all eligible patients were subjected to EUS-FNB but rather to EUS-FNA because of the reasons mentioned above. A final limitation is the fact that no fixed number of FNB-passes was performed in the study. Instead, the endosonographer based the number of passes on the macroscopic yield of the FNB-core.
In conclusion, EUS-guided fine needle biopsy sampling performed with a 22-gauge reverse bevel needle is sensitive for the diagnosis of PanNET. However, the biopsy yield is sparse and the quality of the biopsy is inadequate for reliable Ki-67-indexing and grading of tumors. Therefore, the studied EUS-FNB approach leads to a significant risk for under-grading of PanNETs. Most probably, that risk is at least partially due to the small size of tissue cores gained by EUS-FNB. End-cutting FNB-needles might be better options with a higher likelihood for correct tumor grading. Improvement in needle design and optimization of sampling technique is still warranted and further studies on the topic are needed.